Abstract

We construct group codes over two letters (i.e., bases of subgroups of a two-generated
free group) with special properties. Such group codes can be used for reducing algorithmic
problems over large alphabets to algorithmic problems over a two-letter alphabet.
Our group codes preserve aperiodicity of inverse finite automata. As an application we
show that the following problems are PSPACE-complete for two-letter alphabets (this was
previously known for large enough finite alphabets): the intersection-emptiness problem
for inverse finite automata, the aperiodicity problem for inverse finite automata, and
the closure-under-radical problem for finitely generated subgroups of a free group. The
membership problem for 3-generated inverse monoids is PSPACE-complete.

1 Introduction 

Codes and coding theory are a well-known and important subject. In its most general form,
a code over an alphabet $A$ is defined to be a subset $C$ of $A^*$ such that any concatenation
of elements of $C$ can be uniquely factored, or "decoded", into a sequence of elements of $C$.
Equivalently, the submonoid $\langle C \rangle$ of $A^*$ generated by $C$ is free with base $C$,
i.e., $\langle C \rangle$ is isomorphic to the free monoid $C^*$. As a reference see [5]. Some
notation: $A^*$ denotes the free monoid over $A$, i.e., the set of all finite sequences ("words")
of elements of $A$, including the empty word. $A^+$ denotes the free semigroup over $A$, i.e.,
the set of all non-empty finite sequences over $A$.

For groups one can use the same definition of a code, replacing "free monoid" by "free
group". In the literature such a code is called a base of a free group. We'll call it a group code
because we will use it in the spirit of information coding. A precise definition of a group code
appears below. First we need some notation: The free group over a generating set $A$ is denoted
by $\mathrm{FG}(A)$. We use a copy $A^{-1} = \{a^{-1} : a \in A\}$ of $A$, disjoint from $A$, to
denote the inverses of the generators. We denote $A \cup A^{-1}$ by $A^{\pm 1}$. For
$w = a_1 \ldots a_n$ with $a_1, \ldots, a_n \in A^{\pm 1}$, the inverse of $w$ is defined to be
$w^{-1} = a_n^{-1} \ldots a_1^{-1}$, where $(a^{-1})^{-1}$ is always replaced by $a$ for all
$a \in A$. The identity element of $\mathrm{FG}(A)$ is the empty word, and is denoted by 1. The
elements of $\mathrm{FG}(A)$ are all the words over the alphabet $A^{\pm 1}$ that are reduced,
i.e., that contain no subsegment of the form $a a^{-1}$ or $a^{-1} a$ (for any $a \in A$). In
general, for any word $w \in (A^{\pm 1})^*$ we define the reduction of $w$ to be the word obtained
by cancelling all subsegments of the form $v v^{-1}$ (with $v \in (A^{\pm 1})^*$) iteratively as
much as possible, and we denote the resulting reduced word by $\mathrm{red}(w)$. For any word $w$
we denote its length by $|w|$. See [12], [11], [8] for background on free groups.
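
The reduction map can be illustrated by a minimal Python sketch. It assumes a string
encoding that is not part of the paper: a lowercase letter stands for a generator and the
matching uppercase letter for its formal inverse. The names `inv_letter`, `inverse` and
`red` are chosen here for illustration only.

```python
def inv_letter(c: str) -> str:
    """Formal inverse of one letter: lowercase <-> uppercase (assumed encoding)."""
    return c.swapcase()


def inverse(w: str) -> str:
    """Inverse word: reverse the word and invert every letter."""
    return "".join(inv_letter(c) for c in reversed(w))


def red(w: str) -> str:
    """Free reduction: cancel adjacent pairs x x^{-1} until none remain."""
    stack = []
    for c in w:
        if stack and stack[-1] == inv_letter(c):
            stack.pop()          # c cancels the letter on top of the stack
        else:
            stack.append(c)
    return "".join(stack)


if __name__ == "__main__":
    print(red("abAaB"))          # a b a^-1 a b^-1
    print(red("ab" + inverse("ab")))
```

A single left-to-right pass with a stack suffices because free reduction is confluent: the
reduced word does not depend on the order in which cancellations are performed.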

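Any function $f : A \to (B^{\pm 1})^*$ can be extended (uniquely) to a group morphism
$f^{(G)} : \mathrm{FG}(A) \to \mathrm{FG}(B)$ defined by
$f^{(G)}(a_1^{\varepsilon_1} \ldots a_n^{\varepsilon_n}) = \mathrm{red}(f(a_1)^{\varepsilon_1} \ldots f(a_n)^{\varepsilon_n})$,
where $\varepsilon_1, \ldots, \varepsilon_n \in \{-1, 1\}$.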

Important convention: Throughout this paper we will view the free group $\mathrm{FG}(A)$ as a
subset of the free monoid $(A^{\pm 1})^*$; namely, $\mathrm{FG}(A)$ consists of all the reduced
words over $A^{\pm 1}$. Of course, $\mathrm{FG}(A)$ is only a subset of $(A^{\pm 1})^*$, not a
submonoid.

Definition 1.1 Let $\varphi : A \to (B^{\pm 1})^*$ be a map whose extension to a free-group
morphism $\varphi^{(G)} : \mathrm{FG}(A) \to \mathrm{FG}(B)$ is injective. Then the image set
$\varphi^{(G)}(A)$ ($\subseteq \mathrm{FG}(B) \subseteq (B^{\pm 1})^*$) is called a group code
over $B$, and the elements of $\varphi^{(G)}(A)$ are called code words. By our convention,
$\mathrm{FG}(B)$ is a subset of $(B^{\pm 1})^*$, and hence a group code is a set of words.

The injective map $\varphi^{(G)}|_A : A \to \mathrm{FG}(B)$ defined by
$a \mapsto \mathrm{red}(\varphi(a))$, i.e., the restriction of $\varphi^{(G)}$ to $A$, is called
a group encoding of $A$ over $B$.

The study of free groups and of bases of free groups (i.e., group codes) has a long history
[12], [11], [8]. In particular, Nielsen showed in the 1920s that every finitely generated subgroup
of a free group is itself free and hence has a group code. A little later in the 1920s Schreier 
extended Nielsen's result to all subgroups of a free group. So, group codes can be finite or 
infinite. We note the following however: 

Proposition 1.2 An infinite group code cannot be a regular language, but can be deterministic 
context-free. 

Proof. If an infinite regular group code existed we could apply the Pumping Lemma, so the
group code would contain all words of the form $w_n = u x^n v$ ($n \in \mathbb{N}$), for some
fixed words $u, x, v$, with $x$ non-empty. But then the following non-trivial relation would hold
among code words:
$w_2 w_1^{-1} w_2 = w_3$.

The example $\{a^n b a^{-n} : n \geq 0\}$ over the alphabet $\{a, b\}^{\pm 1}$ shows that there
are infinite group codes that are deterministic context-free languages. The set
$\{a^n b a^{-n} : n \geq 0\}$ is a well-known Nielsen basis. □
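
The relation used in the proof can be checked mechanically. The sketch below is only an
illustration under an assumed encoding (lowercase letter for a generator, uppercase for its
inverse); the helper names `pumped` and `relation_holds` are hypothetical. It confirms that
$w_2 w_1^{-1} w_2$ and $w_3$ have the same reduction for sample choices of $u, x, v$.

```python
def inverse(w: str) -> str:
    """Inverse word under the assumed lowercase/uppercase encoding."""
    return w[::-1].swapcase()


def red(w: str) -> str:
    """Free reduction with a stack."""
    stack = []
    for c in w:
        if stack and stack[-1] == c.swapcase():
            stack.pop()
        else:
            stack.append(c)
    return "".join(stack)


def pumped(u: str, x: str, v: str, n: int) -> str:
    """The word w_n = u x^n v produced by pumping."""
    return u + x * n + v


def relation_holds(u: str, x: str, v: str) -> bool:
    """Check w_2 w_1^{-1} w_2 = w_3 in the free group."""
    w1, w2, w3 = (pumped(u, x, v, n) for n in (1, 2, 3))
    return red(w2 + inverse(w1) + w2) == red(w3)


if __name__ == "__main__":
    print(relation_holds("ab", "b", "a"))
    # First few words of the infinite code {a^n b a^-n}: they are pairwise distinct.
    print([("a" * n) + "b" + ("A" * n) for n in range(4)])
```

The check succeeds for every choice of $u, x, v$, since $v v^{-1}$, $x x^{-1}$ and $u^{-1} u$
cancel in turn, leaving $u x^3 v$.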

We are interested in group codes over an alphabet of size 2. Just as for the usual codes 
(over monoids), the main purpose of group codes is to translate large alphabets into smaller 
alphabets. This in turn can be used to show that some problems that are hard over large 
alphabets are also hard over a 2-letter alphabet. We will consider the fixed two-letter alphabet
$\{a, b\}$ and the inverses $a^{-1}, b^{-1}$ of these letters.

Subgroups of a free group are closely related to inverse monoids and inverse finite automata
[13]. By definition, an inverse finite automaton is a structure
$\mathcal{A} = (Q, X, \delta, q_0, q_f)$ where, according to the standard notation in [9], $Q$ is
the set of states, $q_0$ is the start state, and $q_f$ is the accept state. For inverse automata,
the input alphabet is $X \cup X^{-1} = X^{\pm 1}$, although we only mention $X$ explicitly; the
designation "inverse" automatically provides the inverse letters. The state-transition relation
$\delta$ is a partial function $\delta : Q \times X^{\pm 1} \to Q$, and is required to have the
following property: For each letter $x \in X$, the partial function
$\delta(\cdot, x) : q \in Q \mapsto \delta(q, x) \in Q$ is injective. Moreover, we require that
the partial function $\delta(\cdot, x^{-1})$ be the inverse of $\delta(\cdot, x)$. We represent an
inverse finite automaton by its state-graph, in the same way as for ordinary finite automata
(see [9]), except that we omit the edges labeled by inverse letters. More precisely, when
$\delta(p, x) = q$ (with $p, q \in Q$, $x \in X$) we draw an edge $p \xrightarrow{x} q$; we
implicitly also have an edge $q \xrightarrow{x^{-1}} p$, but we don't draw that edge. See e.g. [6]
for more information on inverse automata.
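
The two requirements on $\delta$ can be made concrete with a small Python sketch. It is
an illustration only: the dictionary representation, the lowercase/uppercase convention
for $x$ and $x^{-1}$, and the names `complete_with_inverses` and `run` are assumptions
made here, not notation from the paper.

```python
def complete_with_inverses(delta):
    """Add the implicit inverse edges to a partial transition table.

    `delta` maps (state, positive letter) -> state. A dict is a partial
    function, so determinism is automatic; injectivity of each letter is
    checked here and a ValueError is raised if it fails.
    """
    full = dict(delta)
    for (p, x), q in delta.items():
        key = (q, x.swapcase())            # the implicit edge q --x^-1--> p
        if key in full and full[key] != p:
            raise ValueError(f"letter {x!r} is not injective at state {q!r}")
        full[key] = p
    return full


def run(full, state, word):
    """Follow `word` from `state`; return None if some transition is undefined."""
    for c in word:
        state = full.get((state, c))
        if state is None:
            return None
    return state


if __name__ == "__main__":
    delta = {(1, "a"): 2, (2, "a"): 3, (1, "b"): 1}
    full = complete_with_inverses(delta)
    print(run(full, 1, "aaAA"))   # walks 1 -> 2 -> 3 -> 2 -> 1
    print(run(full, 1, "A"))      # a^-1 is undefined at state 1
```

Only the positive edges are stored by the caller, mirroring the convention of drawing only
the edges labeled by letters of $X$.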

Let $\kappa : X \to (\{a, b\}^{\pm 1})^*$ be any group encoding and let $\mathcal{A}$ be any
inverse finite automaton with input alphabet $X$. We define the encoded inverse finite automaton
$\kappa(\mathcal{A})$, with input alphabet $\{a, b\}$, by the following two-step construction:

(1) We replace every edge $p \xrightarrow{x} q$ of $\mathcal{A}$ (with $x \in X$) by a path
labeled by $\kappa(x)$; to do this we introduce $|\kappa(x)| - 1$ new states and $|\kappa(x)|$
new edges. Implicitly, we now also have the inverses of the new edges, thus obtaining a path from
$q$ to $p$ labeled by $\kappa(x^{-1})$. Let $\kappa(\mathcal{A})_0$ be the nondeterministic finite
automaton obtained so far.

(2) Starting from $\kappa(\mathcal{A})_0$ we apply the fold operation as much as possible. This
means that any two edges (explicitly drawn or implicit) with a common beginning or end vertex,
and with identical label in $\{a, b\}^{\pm 1}$, are made equal. For example, if
$p \xrightarrow{a^{\varepsilon}} q_1$ and $p \xrightarrow{a^{\varepsilon}} q_2$ are present (with
$\varepsilon \in \{-1, 1\}$) then one folding step makes $q_1$ equal to $q_2$, and the above two
edges become equal. See e.g., [11], [13], [6] for more information on the very classical fold
operation. In particular, it is well known that maximal folding produces a unique resulting
automaton, which does not depend on the folding sequence chosen. We denote this resulting
automaton by $\kappa(\mathcal{A})$; it is an inverse automaton if $\mathcal{A}$ is an inverse
automaton. We denote the transition function of $\kappa(\mathcal{A})$ by $\delta_\kappa$.
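
The two-step construction can be sketched in Python as follows. This is a minimal
illustration under stated assumptions, not the authors' implementation: automata are
given as sets of positive edges `(p, letter, q)`, inverse letters are written in
uppercase, code words are assumed non-empty, and folding is done by a simple repeated
pass with a union-find structure rather than by an optimized algorithm. The names
`fold` and `encode` are hypothetical.

```python
def fold(edges):
    """Maximal folding of a set of positive edges (p, x, q).

    Returns the folded edge set and a function mapping each original
    state to its representative.
    """
    parent = {}

    def find(s):
        parent.setdefault(s, s)
        while parent[s] != s:
            parent[s] = parent[parent[s]]   # path halving
            s = parent[s]
        return s

    changed = True
    while changed:
        changed = False
        out, inn = {}, {}                   # (source, x) -> target, (target, x) -> source
        for p, x, q in edges:
            p, q = find(p), find(q)
            for table, key, val in ((out, (p, x), q), (inn, (q, x), p)):
                other = table.setdefault(key, val)
                if find(other) != find(val):
                    parent[find(other)] = find(val)   # one folding step
                    changed = True
    return {(find(p), x, find(q)) for p, x, q in edges}, find


def encode(edges, k):
    """Step (1): replace each edge by a path labeled k[x]; step (2): fold."""
    new_edges, fresh = set(), 0
    for p, x, q in edges:
        word, cur = k[x], p
        for i, c in enumerate(word):
            if i == len(word) - 1:
                nxt = q
            else:
                nxt = ("new", fresh)        # one of the |k(x)| - 1 new states
                fresh += 1
            # An inverse letter traversed from cur to nxt is a positive edge nxt -> cur.
            new_edges.add((cur, c, nxt) if c.islower() else (nxt, c.lower(), cur))
            cur = nxt
    return fold(new_edges)


if __name__ == "__main__":
    automaton = {(1, "x", 1), (1, "y", 1)}          # two loops at one state
    folded, find = encode(automaton, {"x": "b", "y": "abA"})
    print(sorted(folded, key=repr))
```

In the usage example the path for `abA` initially has two new states; the two `a`-edges
leaving state 1 are folded together, so the result has two states and three positive edges.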

In general, for any automaton $\mathcal{M}$ we let $L_{\mathcal{M}}$ denote the language accepted
by $\mathcal{M}$. For an inverse automaton $\mathcal{A} = (Q, A, \delta, q_0, q_f)$ we consider
the language accepted, $L_{\mathcal{A}} \subseteq (A^{\pm 1})^*$, as well as the group language of
$\mathcal{A}$, defined as follows:

Definition 1.3 The group language of a finite inverse automaton $\mathcal{A}$ with input
alphabet $A$ consists of the reduced words ($\in (A^{\pm 1})^*$) accepted by $\mathcal{A}$; in
other words, the group language of $\mathcal{A}$ is $L_{\mathcal{A}} \cap \mathrm{FG}(A)$.

Lemma 1.4 For a finite inverse automaton $\mathcal{A}$ with input alphabet $A$, the group
language satisfies $L_{\mathcal{A}} \cap \mathrm{FG}(A) = \mathrm{red}(L_{\mathcal{A}})$.

Proof. This is Lemma 1.1 in [6]. □ 

Note that by Benois' theorem [3], [1], $\mathrm{red}(L_{\mathcal{A}})$ is also accepted by a
finite automaton with alphabet $A^{\pm 1}$. But this automaton cannot be an inverse automaton,
except in trivial cases. Indeed, an inverse automaton will always accept some non-reduced words
(except when $L_{\mathcal{A}}$ is empty or consists of just the empty word).

An automaton with involution over the alphabet $A^{\pm 1}$ is an automaton $\mathcal{A}$ such
that for every edge $p \xrightarrow{x} q$ of $\mathcal{A}$, with $x \in A^{\pm 1}$,
$q \xrightarrow{x^{-1}} p$ is also an edge of $\mathcal{A}$. We will always assume that all
automata over the alphabet $A^{\pm 1}$ are automata with involution. Notice that an
automaton with involution is deterministic if and only if it is an inverse automaton.

Let $\mathcal{A}$ be any automaton with involution over the alphabet $A^{\pm 1}$. The folded
automaton $\rho(\mathcal{A})$ is defined as above by applying some maximal folding sequence to
$\mathcal{A}$. This determines an equivalence relation $\sim$ on the states of $\mathcal{A}$ by
defining two states to be equivalent if they define the same state of $\rho(\mathcal{A})$, that
is, if the two states are folded onto one another. Recall that a Dyck word over $A^{\pm 1}$ is a
word that reduces to the identity word in $\mathrm{FG}(A)$. The language of Dyck words is known
to be the smallest language containing the empty word and closed under concatenation and the
conjugation operation $w \mapsto a w a^{-1}$, for all $a \in A^{\pm 1}$.

Lemma 1.5 Let $\mathcal{A}$ be an automaton with involution over the alphabet $A^{\pm 1}$. Then
states $p, q$ of $\mathcal{A}$ satisfy $p \sim q$ if and only if there is a Dyck word $w$ such
that $w$ labels a path from $p$ to $q$ in $\mathcal{A}$.

Proof. Assume that the reduced automaton $\rho(\mathcal{A})$ is obtained by a sequence of $m$
foldings. Let $\mathcal{A}_i$ be the automaton obtained after $i$ foldings, $0 \leq i \leq m$.
There is a corresponding equivalence relation $\sim_i$ on the states of $\mathcal{A}$, and
$\sim_0 \subseteq \sim_1 \subseteq \ldots \subseteq \sim_m = \sim$.

We will prove by induction that if $i$ is the least integer such that $p \sim_i q$, then there
is a Dyck word $w$ that labels a path from $p$ to $q$ in $\mathcal{A}$. This is true if $i = 0$
since then the empty word labels a path from $p$ to itself.

Assume that if $r \sim_i s$ then there is a Dyck word labeling a path from $r$ to $s$ in
$\mathcal{A}$; and assume that $p \sim_{i+1} q$, but $p \not\sim_i q$. Since a folding identifies
exactly two states, the $(i+1)$st folding identifies the $\sim_i$ class of $p$ with that of $q$.
Let $[r]_{\sim_i}$ denote the $\sim_i$ equivalence class of a state $r$ of $\mathcal{A}$.

Thus there is a $\sim_i$ equivalence class, $X$, such that there are edges of $\mathcal{A}_i$,
$[p]_{\sim_i} \xrightarrow{x} X$ and $X \xleftarrow{x} [q]_{\sim_i}$, for some $x \in A^{\pm 1}$.
It is clear that every path in $\mathcal{A}_i$ lifts, by "unfolding", to a path of $\mathcal{A}$.
Thus in $\mathcal{A}$ there are states $p', q'$ and states $r, s \in X$ such that
$p' \in [p]_{\sim_i}$, $q' \in [q]_{\sim_i}$, and $p' \xrightarrow{x} r$ and
$s \xleftarrow{x} q'$ in $\mathcal{A}$. Since $p \sim_i p'$, $r \sim_i s$ and $q' \sim_i q$ we
have, by induction, Dyck words $u, v, w$ that label paths from $p$ to $p'$, $q'$ to $q$ and $r$
to $s$ respectively in $\mathcal{A}$. Therefore the Dyck word $u x w x^{-1} v$ labels a path from
$p$ to $q$ in $\mathcal{A}$.

Conversely, a straightforward induction on the length of a Dyck word $w$ shows that if $w$
labels a path from a state $p$ to a state $q$ of $\mathcal{A}$ then $p \sim q$. □

Corollary 1.6 Let $\mathcal{A}$ be an automaton with involution over the alphabet $A^{\pm 1}$
and let $\rho(\mathcal{A})$ be the reduced automaton of $\mathcal{A}$. Let $p, q$ be states of
$\mathcal{A}$. If $w = a_1 \ldots a_n$, with $a_i \in A^{\pm 1}$, $1 \leq i \leq n$, labels a path
from $[p]_\sim$ to $[q]_\sim$ in $\rho(\mathcal{A})$, then there are Dyck words
$u_0, \ldots, u_n$ such that $u_0 a_1 u_1 \ldots a_n u_n$ labels a path from $p$ to $q$ in
$\mathcal{A}$. In particular, $\mathrm{red}(L(\mathcal{A})) = \mathrm{red}(L(\rho(\mathcal{A})))$.

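Proof. There are states $p = p_0, p_1, \ldots, p_n = q$ of $\mathcal{A}$ such that
$[p_i]_\sim \xrightarrow{a_{i+1}} [p_{i+1}]_\sim$ are edges of $\rho(\mathcal{A})$. Since paths in
$\rho(\mathcal{A})$ lift to paths of $\mathcal{A}$, there are states $p'_0, p'_1, \ldots, p'_n$ of
$\mathcal{A}$ such that $p_i \sim p'_i$ for $0 \leq i \leq n$, and such that there are edges
$p'_i \xrightarrow{a_{i+1}} p'_{i+1}$ of $\mathcal{A}$. By Lemma 1.5 there are Dyck words
$u_0, \ldots, u_n$ connecting the states $p_i$ and $p'_i$ in $\mathcal{A}$, and the first
assertion of the corollary follows.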

It is clear that $\mathrm{red}(L(\mathcal{A})) \subseteq \mathrm{red}(L(\rho(\mathcal{A})))$ since
paths in $\mathcal{A}$ fold to paths in $\rho(\mathcal{A})$. The converse inclusion follows from
the first assertion of the corollary if we take $w$ to be a reduced word. □

We record, in the proposition below, a special case of the above corollary that is of particular
interest in this paper.

Proposition 1.7 Let $\kappa : X \to (A^{\pm 1})^*$ be any group encoding, and let
$\kappa^{(M)} : (X^{\pm 1})^* \to (A^{\pm 1})^*$ be the corresponding monoid morphism. Let
$\mathcal{A}$ be an inverse finite automaton with alphabet $X$ and let
$L_{\mathcal{A}} \subseteq (X^{\pm 1})^*$ be the language it accepts. Then the group language of
$\kappa(\mathcal{A})$ is $\mathrm{red}(\kappa^{(M)}(L_{\mathcal{A}}))$. In other words,
$\mathrm{red}(L_{\kappa(\mathcal{A})}) = \mathrm{red}(\kappa^{(M)}(L_{\mathcal{A}}))$.

2 Aperiodicity preserving group codes 

Some standard definitions: A monoid $M$ is called aperiodic iff $x^{n+1} = x^n$ for all
$x \in M$, for some constant $n$ depending only on $M$. A finite automaton $\mathcal{A}$ is called
aperiodic iff the syntactic monoid of $\mathcal{A}$ is aperiodic.

Let $Y$ be a finite subset of $\mathrm{FG}(A)$, and let $H = \langle Y \rangle$ be the subgroup
of $\mathrm{FG}(A)$ generated by $Y$. Then we can construct a finite inverse automaton
$\mathcal{A}_H$ with the following property: A reduced word $w \in \mathrm{FG}(A)$ belongs to
$H = \langle Y \rangle$ iff $\mathcal{A}_H$ accepts $w$. In other words: The group language
$L(\mathcal{A}_H) \cap \mathrm{FG}(A)$ of $\mathcal{A}_H$ is $H$. A construction of
$\mathcal{A}_H$ goes as follows (see [6], p. 251, for more details): Consider cyclic graphs
labeled by the elements of $Y$, and glue these cycles together at their origins; if we now pick
this common origin as the start and accept state we obtain a nondeterministic automaton. Next,
we apply maximal folding. The resulting finite inverse automaton is $\mathcal{A}_H$. One can show
that it only depends on $H$ (not on the originally given generating set $Y$).
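
The construction of $\mathcal{A}_H$ can be sketched along the same lines. The code below is
an illustration under assumptions made here, not the construction of [6] verbatim:
generators are given as non-empty reduced strings, uppercase letters denote inverses, the
base state is called 0, and folding uses a simple repeated pass. The names `fold`,
`subgroup_automaton` and `accepts` are hypothetical.

```python
def fold(edges):
    """Maximal folding of positive edges (p, x, q) via union-find."""
    parent = {}

    def find(s):
        parent.setdefault(s, s)
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    changed = True
    while changed:
        changed = False
        out, inn = {}, {}
        for p, x, q in edges:
            p, q = find(p), find(q)
            for table, key, val in ((out, (p, x), q), (inn, (q, x), p)):
                other = table.setdefault(key, val)
                if find(other) != find(val):
                    parent[find(other)] = find(val)
                    changed = True
    return {(find(p), x, find(q)) for p, x, q in edges}, find


def subgroup_automaton(generators):
    """Glue one cycle per generator at base state 0, then fold."""
    edges, fresh = set(), 0
    for g in generators:
        cur = 0
        for i, c in enumerate(g):
            if i == len(g) - 1:
                nxt = 0                      # close the cycle at the origin
            else:
                fresh += 1
                nxt = fresh
            edges.add((cur, c, nxt) if c.islower() else (nxt, c.lower(), cur))
            cur = nxt
    folded, find = fold(edges)
    forward = {(p, x): q for p, x, q in folded}
    backward = {(q, x): p for p, x, q in folded}
    return forward, backward, find(0)


def accepts(automaton, word):
    """Run `word` from the base state; accept iff the run returns to it."""
    forward, backward, base = automaton
    state = base
    for c in word:
        table, x = (forward, c) if c.islower() else (backward, c.lower())
        if (state, x) not in table:
            return False
        state = table[(state, x)]
    return state == base


if __name__ == "__main__":
    H = subgroup_automaton(["aa", "b"])
    print(accepts(H, "aabAA"), accepts(H, "abA"))
```

For a reduced input word, `accepts` decides membership in the subgroup generated by the
given words; for non-reduced words it decides acceptance by the automaton, which is a
different question.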

Definition 2.1 A subgroup $H$ of a group $G$ is closed under radical (also called
"radical-closed", or "pure") iff for all $g \in G$ and all $N > 0$ we have: $g^N \in H$ implies
$g \in H$. The radical of $H$ in $G$ is the set
$\sqrt{H} = \{g \in G : \text{there exists } N > 0 \text{ with } g^N \in H\}$.

Closure under radical for subgroups of a free group is intimately connected to aperiodicity 
of inverse automata: 

Lemma 2.2 Let $Y$ be a finite subset of $\mathrm{FG}(A)$. The subgroup $H = \langle Y \rangle$ of
$\mathrm{FG}(A)$ generated by $Y$ is closed under radical iff the finite inverse automaton
$\mathcal{A}_H$ is aperiodic.

Proof. This is Theorem 3.1 in [6]. □

Proposition 2.3 (Transitivity of radical closure). Consider subgroups $K \leq H \leq G$ such
that $K$ is radical-closed in $H$ and $H$ is radical-closed in $G$. Then $K$ is radical-closed
in $G$.

Proof. Suppose $x \in G$ is such that $x^n \in K$, for some integer $n \geq 2$. Then
$x^n \in H$, hence $x \in H$, by radical closure of $H$ in $G$. So we have now $x \in H$ and
$x^n \in K$. This implies that $x \in K$, by radical closure of $K$ in $H$. □

Definition 2.4 A group homomorphism $h : \mathrm{FG}(X) \to \mathrm{FG}(A)$ preserves closure
under radical iff for every subgroup $H$ of $\mathrm{FG}(X)$ we have: $H$ is closed under radical
in $\mathrm{FG}(X)$ iff $h(H)$ is closed under radical in $\mathrm{FG}(A)$.

A group encoding $\varphi : X \to (A^{\pm 1})^*$ is said to preserve closure under radical iff
the group homomorphism $\varphi^{(G)} : \mathrm{FG}(X) \to \mathrm{FG}(A)$ determined by
$\varphi$ preserves closure under radical.






Proposition 2.5 Let $f : \mathrm{FG}(X) \to \mathrm{FG}(A)$ be an injective morphism such that
the image group $\mathrm{Im}(f)$ of $f$ is radical-closed in $\mathrm{FG}(A)$. Then for all
subgroups $H$ of $\mathrm{FG}(X)$ we have: $H$ is radical-closed in $\mathrm{FG}(X)$ iff $f(H)$
is radical-closed in $\mathrm{FG}(A)$. In other words:

A group encoding $\varphi$ preserves radical-closure iff $\mathrm{Im}(\varphi)$ (reduced in the
free group) is radical-closed.

Proof. Suppose $f(H)$ is radical-closed in $\mathrm{FG}(A)$. Then $f(H)$ is also radical-closed
in $\mathrm{Im}(f)$. Hence, since $f$ is an isomorphism between the groups $\mathrm{FG}(X)$ and
$\mathrm{Im}(f)$, $H$ is radical-closed in $\mathrm{FG}(X)$.

Suppose $H$ is radical-closed in $\mathrm{FG}(X)$. Then $f(H)$ is radical-closed in
$\mathrm{Im}(f)$, since $f$ is an isomorphism between $\mathrm{FG}(X)$ and $\mathrm{Im}(f)$.
Hence, since $\mathrm{Im}(f)$ is radical-closed in $\mathrm{FG}(A)$, transitivity of radical
closure implies that $f(H)$ is also radical-closed in $\mathrm{FG}(A)$. □

Example: A family of finite aperiodic two-letter group codes of all sizes 

Consider $C_n = \{a^i b a^{-i} : 0 \leq i \leq n - 1\}$, over the alphabet $\{a, b\}^{\pm 1}$.
It is well known that this set has the Nielsen property, hence it is a group code (compare with
Ex. 3, Sect. 3.2, p. 138 in [12]). Moreover, the inverse automaton $\mathcal{A}$ given by the
following transition table (with state set $\{1, 2, \ldots, n\}$, with 1 as both start and accept
state) satisfies:

$\mathrm{red}(L_{\mathcal{A}}) = \mathrm{red}(\langle C_n \rangle)$,

where "red" refers to reduction in $\mathrm{FG}(\{a, b\})$. In other words, the free group
$\mathrm{red}(\langle C_n \rangle)$ is the group language of $\mathcal{A}$.

The syntactic inverse monoid of $\mathcal{A}$ is generated by the identity map, corresponding to
the letter $b$, and the partial map $i \in \{1, 2, \ldots, n - 1\} \mapsto i + 1$ (undefined on
$n$), corresponding to the letter $a$. Since this is a one-generator monoid with zero, satisfying
$a^n = 0$, the monoid is aperiodic.
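
The nilpotency $a^n = 0$ can be checked directly. The sketch below is an illustration with
assumed conventions: partial injective maps on $\{1, \ldots, n\}$ are Python dictionaries,
the zero of the monoid is the empty dictionary, and the names `compose`, `power` and
`shift` are hypothetical. A cyclic permutation is included only as a contrasting example
of a map that is not aperiodic.

```python
def compose(f, g):
    """Partial map 'first f, then g'; undefined where either step is undefined."""
    return {x: g[f[x]] for x in f if f[x] in g}


def power(f, k, n):
    """k-th power of the partial map f on {1, ..., n} (k = 0 gives the identity)."""
    result = {i: i for i in range(1, n + 1)}
    for _ in range(k):
        result = compose(result, f)
    return result


def shift(n):
    """The action of the letter a: i -> i + 1 for i < n, undefined on n."""
    return {i: i + 1 for i in range(1, n)}


if __name__ == "__main__":
    n = 5
    a = shift(n)
    print(power(a, n - 1, n))                      # only the state 1 survives n - 1 steps
    print(power(a, n, n) == {})                    # a^n is the empty map (zero)
    print(power(a, n + 1, n) == power(a, n, n))    # hence a^(n+1) = a^n
```

The same computation with a cyclic permutation in place of the shift gives powers that
never stabilize, which is the behaviour aperiodicity rules out.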

In summary we have: 

Proposition 2.6 For any alphabet $X = \{x_1, x_2, \ldots, x_n\}$ of size $n$, the map
$f : x_i \mapsto a^{i-1} b a^{-i+1}$ ($1 \leq i \leq n$) is a group encoding into a
two-generated free group that preserves closure under radical.
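
As an illustration of this encoding, the following Python sketch computes the image of a
word over $X^{\pm 1}$. The input format is an assumption made here: a word is a list of
pairs `(i, e)` standing for $x_i^{e}$ with $e \in \{1, -1\}$, and uppercase letters denote
inverses over $\{a, b\}$. The names `code_word` and `encode_word` are hypothetical.

```python
def inverse(w: str) -> str:
    """Inverse word under the assumed lowercase/uppercase encoding."""
    return w[::-1].swapcase()


def red(w: str) -> str:
    """Free reduction with a stack."""
    stack = []
    for c in w:
        if stack and stack[-1] == c.swapcase():
            stack.pop()
        else:
            stack.append(c)
    return "".join(stack)


def code_word(i: int) -> str:
    """f(x_i) = a^(i-1) b a^(-(i-1)), for i >= 1."""
    return "a" * (i - 1) + "b" + "A" * (i - 1)


def encode_word(word) -> str:
    """Image of x_{i1}^{e1} ... x_{im}^{em} under the extended group morphism."""
    pieces = []
    for i, e in word:
        piece = code_word(i)
        pieces.append(piece if e == 1 else inverse(piece))
    return red("".join(pieces))


if __name__ == "__main__":
    print(encode_word([(2, 1), (3, 1)]))     # image of x_2 x_3
    print(encode_word([(1, 1), (3, -1)]))    # image of x_1 x_3^-1
```

Each code word has length $2i - 1$, so the encoding of a word over an $n$-letter alphabet
is at most $2n - 1$ times longer than the word itself before reduction.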

By combining the above lemmas and propositions we obtain: 

Corollary 2.7 Let $f$ be the group encoding defined in Proposition 2.6. Let
$\{w_1, \ldots, w_k\}$ be any finite set of words $\subseteq (X^{\pm 1})^*$. Then the subgroup
$\langle w_1, \ldots, w_k \rangle$ of $\mathrm{FG}(X)$ is closed under radical iff the subgroup
$\langle f(w_1), \ldots, f(w_k) \rangle$ of $\mathrm{FG}(\{a, b\})$ is closed under radical.

Application: Complexity of radical-closure and aperiodicity problems 

Group encodings are log-space computable reductions from large alphabets to small alphabets.
This enables us to show that the problems below about inverse finite automata or about
free groups are PSPACE-complete over two-letter alphabets. Previously it was known that they
are PSPACE-complete over all large enough finite alphabets ([6], Theorem 6.13).

The aperiodicity problem takes as input a finite automaton and asks whether the language
accepted by this automaton is aperiodic. S. Cho and D. Huynh [7] showed that the aperiodicity
problem for general finite automata is PSPACE-complete, and it was shown in [6] (Theorem
6.13) that the problem remains PSPACE-complete for inverse finite automata (over some finite
alphabet).

The radical-closure problem for a free group $\mathrm{FG}(X)$ takes as input a list of words
$w_1, \ldots, w_n \in \mathrm{FG}(X)$, and asks whether the subgroup
$\langle w_1, \ldots, w_n \rangle$ of $\mathrm{FG}(X)$ generated by these words is
closed under radical. It was proved in [6] (Theorem 7.1) that this problem is PSPACE-complete
for some finite alphabet $X$. We can now strengthen these results:

Theorem 2.8 The radical-closure problem for a free group with two generators, and the
aperiodicity problem for inverse finite automata over a two-letter alphabet, are PSPACE-complete.

Proof. By Corollary 2.7, the group encoding $f$ is a reduction of the radical-closure problem
over any fixed finite alphabet to the radical-closure problem over a two-letter alphabet. It was
shown in [6] (Theorem 3.6) that the radical-closure problem and the aperiodicity problem for
inverse finite automata are polynomial-time reducible to each other; in this reduction, the
alphabets are preserved.

Finally, as we saw above, the radical-closure problem is PSPACE-complete over some finite
alphabet, and is in PSPACE for all finite alphabets. □

3 Other applications of group codes 

As we saw, a group encoding is a log-space computable reduction from problems over a possibly
large alphabet to problems over a small alphabet. This will enable us to show that the problems
below about inverse finite automata or about free groups are PSPACE-complete over a two- or
three-letter alphabet.

The intersection-emptiness problem for finite automata takes as input a list of finite
automata $\mathcal{A}_i$ ($i = 1, \ldots, k$) where $k$ is part of the input, and asks whether
the intersection of the languages accepted by these automata is empty. For general deterministic
finite automata this problem was shown to be PSPACE-complete by D. Kozen [10], and for inverse
finite automata PSPACE-completeness was shown in [6] (Proposition 5.3).

Theorem 3.1 The intersection-emptiness problem for inverse finite automata over a fixed 
two-letter alphabet is PSPACE-complete. 

Proof. Let $\mathcal{A}_1, \ldots, \mathcal{A}_n$ be inverse finite automata with alphabet $A$
and let $L_1, \ldots, L_n \subseteq (A^{\pm 1})^*$ be the respective languages that they accept.
Let $f : A \to (B^{\pm 1})^*$ be any group encoding with $|B| = 2$, and let
$L'_1, \ldots, L'_n \subseteq (B^{\pm 1})^*$ be the languages accepted by the inverse finite
automata $f(\mathcal{A}_1), \ldots, f(\mathcal{A}_n)$ respectively.

We claim that $L_1 \cap \ldots \cap L_n = \emptyset$ iff $L'_1 \cap \ldots \cap L'_n = \emptyset$,
which shows that $f$ reduces the intersection-emptiness problem of inverse automata over the
alphabet $A$ to the intersection-emptiness problem of inverse automata over the alphabet $B$.

If $L_1 \cap \ldots \cap L_n \neq \emptyset$ consider $w \in L_1 \cap \ldots \cap L_n$. By
Lemma 1.4 we can assume that $w$ is reduced. Then, by Prop. 1.7,
$\mathrm{red}(f(w)) \in L'_1 \cap \ldots \cap L'_n$, hence
$L'_1 \cap \ldots \cap L'_n \neq \emptyset$.

Conversely, if $y \in L'_1 \cap \ldots \cap L'_n$ ($\neq \emptyset$) we can again assume by
Lemma 1.4 that $y$ is reduced. Then by Prop. 1.7,
$y \in \mathrm{red}(f(L_1)) \cap \ldots \cap \mathrm{red}(f(L_n))$. Since the function
$F = \mathrm{red}(f(\cdot)) : \mathrm{FG}(A) \to \mathrm{FG}(B)$ is injective (by definition of a
group code), it has an inverse function $F^{-1}$ and we have
$F^{-1}(y) \in L_1 \cap \ldots \cap L_n$. So, $L_1 \cap \ldots \cap L_n \neq \emptyset$.

Finally, as we saw above, the intersection-emptiness problem is PSPACE-complete over some
finite alphabet. So the reduction makes the encoded problems PSPACE-complete over a two-letter
alphabet. □

The membership problem for finite inverse monoids is defined as follows: The input is a
finite list of injective partial maps $f_0, f_1, \ldots, f_m$ on a finite set $\{1, \ldots, n\}$.
Each $f_i$ is described by a function table that bijectively maps a subset of $\{1, \ldots, n\}$
to a subset of $\{1, \ldots, n\}$; entries in the table where $f_i$ is not defined are blank. The
question is whether $f_0$ can be written as a composition of some of the $f_i$ and $f_i^{-1}$
($1 \leq i \leq m$); more rigorously, the question is whether $f_0$ belongs to the inverse monoid
generated by $\{f_1, \ldots, f_m\}$. Below we will also consider the membership problem for
3-generator finite inverse monoids; here the input consists of four injective partial maps
$f_0, f_1, f_2, f_3$, and the question is the same as before (now with $m = 3$).
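
To make the problem statement concrete, here is a brute-force Python sketch that
enumerates the inverse monoid generated by a list of partial injective maps and tests
membership. It is an illustration only and takes exponential time and space in general,
so it says nothing about the complexity bounds discussed in this section. Partial maps are
represented as dictionaries, and the names `compose`, `inverse_map`, `freeze` and
`is_member` are assumptions made here.

```python
def compose(f, g):
    """Partial map 'first f, then g'."""
    return {x: g[f[x]] for x in f if f[x] in g}


def inverse_map(f):
    """Inverse of a partial injective map."""
    return {y: x for x, y in f.items()}


def freeze(f):
    """Hashable canonical form of a partial map."""
    return tuple(sorted(f.items()))


def is_member(f0, generators, n):
    """Is f0 in the inverse monoid generated by `generators` on {1, ..., n}?"""
    identity = {i: i for i in range(1, n + 1)}
    letters = [g for f in generators for g in (f, inverse_map(f))]
    seen = {freeze(identity)}
    frontier = [identity]
    while frontier:                          # breadth-first closure under composition
        next_frontier = []
        for h in frontier:
            for g in letters:
                hg = compose(h, g)
                key = freeze(hg)
                if key not in seen:
                    seen.add(key)
                    next_frontier.append(hg)
        frontier = next_frontier
    return freeze(f0) in seen


if __name__ == "__main__":
    f1 = {1: 2, 2: 3}                        # a partial shift on {1, 2, 3}
    print(is_member({1: 3}, [f1], 3))        # f1 composed with itself
    print(is_member({1: 2, 2: 1}, [f1], 3))  # a transposition is not generated
```

The search terminates because there are only finitely many partial maps on a finite set,
but the number of such maps grows faster than $n!$.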

PSPACE-completeness of the membership problem for general functions was shown by
D. Kozen [10]. For permutations the problem is in the complexity class NC (hence in P),
as proved by L. Babai, E. Luks, A. Seress [1]. In [2] M. Beaudry, P. McKenzie, D. Thérien
proved that the membership problem for general functions (not assumed to be injective)
remains PSPACE-complete if the monoid generated by $\{f_1, \ldots, f_m\}$ is assumed to be in
certain pseudo-varieties, and is NP-complete or in NP or in P for certain other pseudo-varieties.

Although inverse monoids are similar to groups in many ways, problems about inverse 
monoids can be much harder than the corresponding problems about groups: 



Theorem 3.2 The membership problem for the class of finite inverse monoids is PSPACE-complete.
The problem remains PSPACE-complete if the finite inverse monoids are required to
have just three generators.

Proof. Since we showed that the intersection-emptiness problem is PSPACE-complete for
inverse finite automata with a two-letter input alphabet, we can apply Kozen's reduction (see
p. 263 of [10]). Kozen's proof needs a few changes in order to make his functions injective.

Let $\mathcal{A}_i = (Q_i, \Sigma, \delta_i, q_i^{(start)}, q_i^{(fin)})$ ($i = 1, \ldots, k$) be
a sequence of inverse finite automata, with the same two-letter alphabet
$\Sigma = \{\alpha, \beta\}$. We can assume that $q_i^{(start)} \neq q_i^{(fin)}$ (see [6]). As
the set acted on by our partial functions we take
$S = \{o_1, o_2\} \cup \bigcup_{i=1}^{k} Q_i$. The functions are defined as follows:

For each $\sigma \in \Sigma$, define $f_\sigma : S \to S$ by
$f_\sigma(q_i) = \delta_i(q_i, \sigma)$ (for $q_i \in Q_i$), and $f_\sigma(o_2) = o_2$.
However, $f_\sigma(o_1)$ is undefined. Also, consider the function $f_{init} : S \to S$ defined
by $f_{init}(q_i^{(start)}) = q_i^{(start)}$ for $1 \leq i \leq k$ and $f_{init}(o_1) = o_2$, and
$f_{init}$ is undefined elsewhere. Finally, the "test function" $f_0 : S \to S$ is defined by
$f_0(q_i^{(start)}) = q_i^{(fin)}$ for $i = 1, \ldots, k$, and $f_0(o_1) = o_2$, and $f_0$ is
undefined elsewhere.

Now it is straightforward to check (exactly as in [10], p. 263) that $f_0$ is generated by
$\{f_{init}, f_\alpha, f_\beta\}$ iff $\bigcap_{i=1}^{k} L_{\mathcal{A}_i} \neq \emptyset$. □